Audio Segmentation using Line Spectral Pairs
Authors
Abstract
This paper describes a technique for unsupervised audio segmentation. The main objective of the work is to study the performance of an audio segmentation system that uses a metric-based method. The system first classifies the audio signal into speech and nonspeech using the variance of the zero-crossing rate. Line spectral pair (LSP) features are then used to detect speaker change points automatically. In the first stage, the Hotelling T² distance metric is used for coarse speaker change detection. In the second stage, the Bayesian information criterion (BIC) is used to validate the potential speaker change points found by the coarse segmentation, which reduces the false alarm rate. A database of four files is used for segmentation; each contains speech recorded from a different combination of male and female speakers, mixed with nonspeech signals such as music and environmental sound. The file with one male and one female speaker gives the best performance, with an F1 measure of 0.9474.
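The two-stage decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pooled-covariance form of the Hotelling T² statistic, the full-covariance Gaussian form of ΔBIC, and the `penalty_weight` parameter are all assumptions made for illustration, and the synthetic arrays stand in for per-frame LSP feature vectors.

```python
import numpy as np


def hotelling_t2(x, y):
    """Hotelling T^2 between two windows of feature vectors (rows are frames).

    Assumes both windows share a covariance, estimated by pooling.
    """
    n1, n2 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    # Solve pooled @ v = diff instead of inverting the matrix explicitly.
    return (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(pooled, diff)


def _logdet_cov(a):
    """Log-determinant of the maximum-likelihood covariance of the rows of a."""
    _, logdet = np.linalg.slogdet(np.cov(a, rowvar=False, bias=True))
    return logdet


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC for a candidate change point between windows x and y.

    Each window is modeled by one full-covariance Gaussian. A positive value
    favors two separate models, i.e. a speaker change at the boundary.
    """
    z = np.vstack([x, y])
    n, d = z.shape
    gain = 0.5 * (n * _logdet_cov(z)
                  - len(x) * _logdet_cov(x)
                  - len(y) * _logdet_cov(y))
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty


# Synthetic stand-ins for LSP feature windows (hypothetical data).
rng = np.random.default_rng(0)
left = rng.normal(0.0, 1.0, size=(200, 4))
right = rng.normal(3.0, 1.0, size=(200, 4))
coarse_score = hotelling_t2(left, right)  # stage 1: large score flags a candidate
validated = delta_bic(left, right) > 0    # stage 2: keep only if Delta-BIC > 0
```

In a sliding-window setting, the T² score would be computed at each candidate boundary, and only its peaks would be passed to the ΔBIC check, which is the costlier of the two tests.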
Similar resources
On the duality between line-spectral frequencies and zero-crossings of signals
Line spectrum pairs (LSPs) are the roots (located in the complex-frequency or z-plane) of symmetric and antisymmetric polynomials synthesized using a linear prediction (LPC) polynomial. The angles of these roots, known as line-spectral frequencies (LSFs), implicitly represent the LPC polynomial and hence the spectral envelope of the underlying signal. By exploiting the duality between the time a...
Interpolation of Long Gaps in Audio Signals Using Line Spectrum Pair Polynomials
This technical report addresses model-based interpolation of long signal gaps. It demonstrates that employing a modified autoregressive (AR) model, computed as a weighted sum of line spectral pair (LSP) polynomials, is computationally more efficient than using a conventional AR model, since longer signal gaps can be interpolated at reduced model order. Keywords: acoustic signal processing, audio...
The Impact of the Spectral Filter Bandwidth on the Spectral Entanglement and Indistinguishability of Photon Pairs of SPDC Process
In this paper, we have investigated the dependence of the spectral entanglement and indistinguishability of photon pairs produced by the spontaneous parametric down-conversion (SPDC) procedure on the bandwidth of spectral filters used in the detection setup. The SPDC is a three-wave mixing process which occurs in a nonlinear crystal and generates entangled photon pairs and utilizes as one of th...
Crosscorrelation-based multispeaker speech activity detection
We propose an algorithm for segmenting multispeaker meeting audio, recorded with personal channel microphones, into speech and non-speech intervals for each microphone’s wearer. An algorithm of this type turns out to be necessary prior to subsequent audio processing because, in spite of close-talking microphones, the channels exhibit a high degree of crosstalk due to unbalanced calibration and ...
Content analysis for audio classification and segmentation
In this paper, we present our study of audio content analysis for classification and segmentation, in which an audio stream is segmented according to audio type or speaker identity. We propose a robust approach that is capable of classifying and segmenting an audio stream into speech, music, environment sound, and silence. Audio classification is processed in two steps, which makes it suitable ...
Publication date: 2012